A unified approach to the applicability domain problem of QSAR models

Authors

  • Dragos Horvath
  • Gilles Marcou
  • Alexandre Varnek
Abstract

The present work proposes a unified conceptual framework to describe and quantify the important issue of the Applicability Domains (AD) of Quantitative Structure-Activity Relationships (QSARs). AD models are conceived as meta-models designed to associate an untrustworthiness score with any molecule M subject to property prediction by a QSAR model μ. Untrustworthiness scores, or “AD metrics”, express the relationship between M (represented by its descriptors in chemical space) and the zones of that space populated by the training molecules on which model μ is based. Scores integrating some of the classical AD criteria (similarity-based, box-based) were considered in addition to newly invented terms, such as the dissimilarity to outlier-free training sets and the correlation breakdown count. A loose correlation is expected to exist between this untrustworthiness and the error affecting the predicted property. While high untrustworthiness does not preclude correct predictions, inaccurate predictions at low untrustworthiness must imperatively be avoided. This kind of relationship is characteristic of the Neighborhood Behavior (NB) problem: dissimilar molecule pairs may or may not display similar properties, but similar molecule pairs with different properties are explicitly “forbidden”. Therefore, statistical tools developed to tackle this latter aspect were applied, leading to a unified AD metric benchmarking scheme. A first use of untrustworthiness scores is the prioritization of predictions, without the need to specify a hard AD border. Moreover, if a significant set of external compounds is available, the formalism allows optimal AD borderlines to be fitted. Finally, consensus AD definitions were built by means of a nonparametric mixing scheme of two AD metrics of comparable quality, and were shown to outperform their respective parents.
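As a rough illustration of the ideas in the abstract, the sketch below computes two simple untrustworthiness scores for query molecules and uses them to prioritize predictions. It is not the authors' implementation: the function names, the toy two-descriptor data, the choice of mean distance to the k nearest training molecules as the similarity-based score, the min-max range count as the box-based score, and rank averaging as a stand-in for the "nonparametric mixing scheme" are all assumptions made for this example.

```python
import math

def knn_untrustworthiness(query, training, k=3):
    """Similarity-based score (assumed form): mean Euclidean distance
    from the query to its k nearest training molecules."""
    dists = sorted(math.dist(query, t) for t in training)
    k = min(k, len(dists))
    return sum(dists[:k]) / k

def box_violations(query, training):
    """Box-based score (assumed form): number of descriptors whose value
    lies outside the min-max range spanned by the training set."""
    count = 0
    for j, value in enumerate(query):
        column = [t[j] for t in training]
        if value < min(column) or value > max(column):
            count += 1
    return count

def ranks(scores):
    """Ranks starting at 1 for the lowest score; ties share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            result[i] = mean_rank
        pos = end + 1
    return result

def consensus_ranking(scores_a, scores_b):
    """Hypothetical consensus: average the ranks given by two AD metrics."""
    return [(a + b) / 2 for a, b in zip(ranks(scores_a), ranks(scores_b))]

# Toy descriptor vectors (made-up numbers, two descriptors per molecule).
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
queries = [(0.5, 0.5), (3.0, 3.0), (1.2, 0.5)]

knn_scores = [knn_untrustworthiness(q, training, k=2) for q in queries]
box_scores = [box_violations(q, training) for q in queries]
consensus = consensus_ranking(knn_scores, box_scores)

# Query indices ordered from lowest to highest consensus untrustworthiness.
priority = sorted(range(len(queries)), key=lambda i: consensus[i])
```

Ordering the queries by `consensus` gives a prioritization that needs no hard AD border. If external compounds with known prediction errors were available, a cutoff on the score could be fitted afterwards, which this sketch does not attempt.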

Similar resources

QSAR Modeling of COX-2 Inhibitory Activity of Some Dihydropyridine and Hydroquinoline Derivatives Using Multiple Linear Regression (MLR) Method

COX-2 inhibitory activities of some 1,4-dihydropyridine and 5-oxo-1,4,5,6,7,8-hexahydroquinoline derivatives were modeled by quantitative structure–activity relationship (QSAR) using the stepwise multiple linear regression (SW-MLR) method. The built model was robust and predictive, with correlation coefficients (R²) of 0.972 and 0.531 for the training and test groups, respectively. The quality of the mod...

Quantitative Structure Activity Relationship Analysis of Coumarins as Free Radical Scavengers by Genetic Function Algorithm

The antioxidant properties of coumarin derivatives were investigated with the 2,2-diphenyl-1-picrylhydrazyl (DPPH) radical scavenging assay by applying Quantitative Structure Activity Relationship (QSAR) studies. The molecular structures were optimized and submitted for the generation of quantum chemical and molecular descriptors. Genetic Function Algorithm (GFA) was employed in m...

In-silico prediction of Cellular Responses to Polymeric Biomaterials from Their Molecular Descriptors

In this work, the quantitative structure activity relationship (QSAR) methodology was applied to model and predict cellular responses to polymers designed for tissue engineering. After calculation and screening of molecular descriptors, linear and nonlinear models were developed using multiple linear regression (MLR) and artificial neural network (ANN) methods. The root m...

QSAR Study of 17β-HSD3 Inhibitors by Genetic Algorithm-Support Vector Machine as a Target Receptor for the Treatment of Prostate Cancer

The 17β-HSD3 enzyme plays a key role in treatment of prostate cancer and small inhibitors can be used to efficiently target it. In the present study, the multiple linear regression (MLR) and support vector machine (SVM) methods were used to interpret the chemical structural functionality against the inhibition activity of some 17β-HSD3 inhibitors. Chemical structural information was described thro...


Volume: 2

Publication date: 2010